perm filename SCENAR.MSS[RDG,DBL]1 blob sn#644394 filedate 1982-03-02 generic text, type C, neo UTF8
@Device[DOVER]
@Make[Report]

@DefineFont(BodyFont,	
	R=<Xerox "TimesRoman10R">,
	I=<Xerox "TimesRoman10I">,
	B=<Xerox "TimesRoman10B">,
	P=<Ascii "TimesRoman10BI">,
	F=<Xerox1 "Gacha10R">,
	C=<Xerox "TimesRoman8R">,
	Z=<Ascii "MATH10R">,
	G=<Xgreek "Hippo10R">,
	Y=<Xerox "TimesRoman8R">,
	X=<Xerox "Symbol10R">,
	U=<Ascii "Sail10">,
	T=<Xerox1 "Gacha10B">)

@DefineFont(SmallBodyFont,
	R=<Xerox "TimesRoman8R">,
	B=<Xerox "TimesRoman8B">,
	I=<Xerox "TimesRoman8I">,
	P=<Xerox "TimesRoman8BI">,
	G=<Xgreek "Hippo8R">,
	F=<Xerox1 "Gacha8R">,
	Z=<Ascii "MATH8R">,
	Y=<Xerox "TimesRoman8R">,
	X=<Xerox "Symbol10R">,
	U=<Ascii "Sail8">,
	C=<Xerox "TimesRoman6R">,
	T=<Xerox1 "Gacha8B">)
@Modify[Verbatim, Break Before]
@Modify[Description, Spread 0, Spacing 1.3]
@Modify[Quotation, Indent 0]
@Use(Bibliography = "GENL.BIB[RDG,DBL]")
@Use(Bibliography = "REPN.BIB[RDG,DBL]")
@Use(Bibliography = "META4.BIB[RDG,DBL]")
@Style[Indent 0]
@Style[References=STD Alphabetic]
@DEFINE[Aside=NoteStyle, LeftMargin -16, Indent 0]
@DEFINE[SubAside=Quotation,Font SmallBodyFont,FaceCode I,Spacing 1,Spread 0.5]
@DEFINE[Subsubsection,Use HdX,Font TitleFont3,FaceCode I,Above .3inch,Centered]
@DEFINE[ENUM1=ENUMERATE, NumberFrom 0, SPREAD=0]
@SET[Page=1]

@EQUATE[YY=R]
@SpecialFont[F1=TimesRoman14]
@Font[TimesRoman10]

@Counter(EquationCounter, Numbered <(Facts @1)>, Referenced <(Facts @1)>,
IncrementedBy tag, INIT 0)
@Begin[Center]
@F1[Using Analogies For Knowledge Acquisition]
@i(A Scenario)
Russell Greiner
@End[Center]

This short scenario will demonstrate some of the ways
analogy can be used to facilitate a Knowledge Acquisition (KA) task.
The examples included show
the type of reasoning (we consider) necessary for this process.

@i{<It is worth mentioning early that this KA module will NOT generate analogies.
Its goal is to understand and use the ones it is given.
Also, note the importance of the interactive nature of this dialogue --
this program will first conjecture an interpretation of the analogy,
and can then ask questions along the way to attempt to confirm it...>}

@Section(Overview of Situation and Goal)

The main characters in this story are
a hypothetical expert system, ES,
and an (assumedly human) expert on EMACS, U.
Their mutual goal is to increase ES's "understanding" of EMACS;
achieved by teaching it more EMACS commands,
and by improving its understanding of previously known commands --
so ES will "know" better when to apply a given command,
and precisely what it will do.
(At a meta-level,)
our research goal is understanding how to use analogies to facilitate
this educational process.

For ES to profitably understand the analogies U may use,
it must initially "know" a little about EMACS, 
(i.e. a few of its basic commands)
and something about editors in general
-- including a description of their purpose,
what sorts of things they are able to do, 
a body of relevant vocabulary terms, etc.
(Appendix @Ref(StartingInfo) justifies this claim, 
explaining why ES must start with some core knowledge, but not too much.)

ES has access to two (complementary) forms
of knowledge about editors and EMACS.
First, it includes ("procedural") rules,
which ES's inference engine can directly use to solve particular problems.
(These solutions will be expressed as the commands to type into EMACS.)
Appendix @Ref(Rules) lists a subset of these problem solving rules.
ES also has a ("declarative") knowledge base of facts about
editors in general,
instantiated with some specific facts about (what it now knows about) EMACS.
(A sketch of this semantic network is included in Appendix @Ref(EditorFacts).)

Both of the Knowledge Bases -- of rules and facts --
will be modified by the KA, the Knowledge Acquisition front-end to ES.
In addition to incorporating new rules/nodes/links,
KA may alter existing facts.
For example, we will see cases where an If clause of a rule is generalized,
and where the "topology" of a network is rearranged.
(Note that KA will not touch certain things 
-- like the rule interpreter and other control procedures.)

We've been rather flippant with our use of the term "understanding" --
just what does it mean to claim that ES now "understands" a new command?
Rather than follow an unnatural and complicated structural
approach@Foot{
That is, we might insist that ES be able to produce the TECO code
corresponding to each new command.
(Note this is similar to a "derivation from first principles".)
However, this is more than most people could do, and not necessary.}
we will use a purely behavioral criterion:
does ES use the correct commands in a given situation?
We will say ES has incorporated a new command if it uses
that command appropriately 
-- meaning that ES types that command when an EMACS expert,
(trying to minimize his keystrokes,)
would type it.

Appendix @Ref(TestBed) lists a small
battery of test questions,
generated before beginning this KA task.
Note we're still not sure just how these
questions should be posed -- that is, in what language to describe 
the initial and desired file (and other things, like current windows,
cursor position, etc).
One possibility is to design a new high-level editor-independent specification --
whatever that would be.
Another would be to type the appropriate E or BRAVO commands, and have them
translated into EMACS.
Comments?
@Chapter(Actual Dialogue)

Reiterating the above situation:
An EMACS expert, U,
is using the Knowledge Acquisition program, KA,
to teach an expert system, ES, 
more about the editor EMACS.
ES's knowledge is contained in the two Knowledge Bases - RULES and EDITORS.
KA has access to both of these KBs, as well as basic facts about the nature
of analogies, etc.  (See [Greiner - Thesis Proposal, ca 12/81]).
Our goal is to figure out how U can profitably use analogies in his descriptions
of new commands and classes of commands.

@i<The dialogue is rather Anglicized -- this is just to help me communicate
the core ideas... I am @B(NOT) planning to build a sophisticated Natural Language
front end.  Also, realize I am @B(NOT) claiming to know how to go about achieving
the various impressive derivations which KA is able to perform below.
A de-cutesied dialogue, based on the below discourse, should be considered
my aspiration -- achieving it will require a great deal more research, of course.>

-----
@BEGIN(Description)
U:@\ Tell me what you now know about EMACS.

KA:@\ I know about simple character movement/modification commands, and some 
file interface commands -- including how to visit a file and to save the
changes made.

U:@\ Let me tell you about word commands:
They are just like single-character commands,
EXCEPT you hold down the Meta key rather than the Control key.

KA:@\ I currently know about two classes of single-character commands:
single-character movement and character-modification.
Does this transformation apply to both of them?

U:@\ Yes.

@BEGIN(Multiple)
KA:@\ I know about the following character movement commands:
@BEGIN(Example)
C-F -- move the cursor forward one *character*
C-B -- move the cursor backward one *character*
@END(Example)
and the following character modification commands:
@BEGIN(Example)
C-D -- delete the *character* the cursor is pointing to
C-<rubout> -- delete the *character* immediately before the cursor
@END(Example)

Can I assume that
@BEGIN(Example)
single character : C-F :: word : M-F?
@END(Example)
(Is this true for each of the 
C-F/M-F, C-B/M-B, C-D/M-D and C-<rubout>/M-<rubout> pairs?)
@END(Multiple)

U:@\ Yes.

KA:@\ Does it make sense to "move forward one WORD"?

U:@\ Yes.

KA:@\ Does the user type M-F to move forward one word?

U:@\ Yes.

KA:@\ Are the following commands correct?
@BEGIN(Example)
	M-F -- move the cursor forward one *word*
	M-B -- move the cursor backward one *word*
@END(Example)
and
@BEGIN(Example)
	M-D -- delete the *word* the cursor is pointing to
	M-<rubout> -- delete the *word* immediately before the cursor.
@END(Example)

U:@\ Yes.

@BEGIN(Multiple)
KA:@\ Ok, I now have the "syntax".
At the risk of boring you, let me show you what I know now.
I analogized from

@BEGIN(Example)
MoveForwardSingleChar
   Isa:		CharacterMovementCommand
   Direction:	Forward
   #Chars:	1
   TypedChars:	/ Character:	"F"
		\ Bucky-Bit:	Control
   Boundary:	<<If at end of file, NoOp>>
@END(Example)

to

@BEGIN(Example)
MoveForwardSingleWord
   Isa:		CharacterMovementCommand
   Direction:	Forward
   #Chars:	?
   TypedChars:	/ Character:	"F"
		\ Bucky-Bit:	Meta
   Boundary:	?
@END(Example)
@Foot{Notice that the terms used here are making a rather
strong assumption:
@i<viz.> that any command can be decomposed into two parts 
-- the character and its "bucky bits".
This issue, of vocabulary, is reiterated in Appendix @Ref(Exam).}

Ignoring the as-of-yet unspecified "?"s, are there any mistakes so far?
@END(Multiple)

U:@\ No.

KA:@\ The other cases seem rather similar... Must I show them also?
(Note they all have a "?" on their respective #Chars slots.)

U:@\ No.
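
@\ The slot-copying step KA has just described can be sketched in Python.
This is only an illustration: the dictionary layout, the function names, and
the use of None to stand for the "?" slots are assumptions of this sketch,
not a description of how KA is built.

```python
import copy

# Frame for the known command, transcribed from the dialogue.
MOVE_FORWARD_SINGLE_CHAR = {
    "Name": "MoveForwardSingleChar",
    "Isa": "CharacterMovementCommand",
    "Direction": "Forward",
    "#Chars": 1,
    "TypedChars": {"Character": "F", "Bucky-Bit": "Control"},
    "Boundary": "If at end of file, NoOp",
}

# Slots whose values depend on the unit being moved over. They cannot be
# copied blindly, so the analogy leaves them open (the "?" entries).
UNIT_DEPENDENT_SLOTS = ("#Chars", "Boundary")


def analogize_command(frame, new_name, new_bucky_bit):
    """Copy a command frame, swap the bucky bit, and blank unit-dependent slots."""
    new_frame = copy.deepcopy(frame)  # deep copy so the source frame is untouched
    new_frame["Name"] = new_name
    new_frame["TypedChars"]["Bucky-Bit"] = new_bucky_bit
    for slot in UNIT_DEPENDENT_SLOTS:
        new_frame[slot] = None  # hypothetical marker for "not yet known"
    return new_frame


def open_questions(frame):
    """List the slots KA must still ask U about."""
    return [slot for slot, value in frame.items() if value is None]


word_frame = analogize_command(
    MOVE_FORWARD_SINGLE_CHAR, "MoveForwardSingleWord", "Meta"
)
print(open_questions(word_frame))
```

The open slots returned by open_questions are exactly the ones KA goes on to
ask about in the rest of the dialogue.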

@BEGIN(Multiple)
KA:@\ Thanks.  Onto the semantics of this command:
The rules relevant to these @B{CharacterMovementCommand} indicate that executing
this M-F command will move the cursor Forward (using the @i{Direction} slot) by
@B{MoveForwardSingleWord}:@i{#Chars} characters (ignoring boundary conditions).
That is, M-F means
@BEGIN(Example)
Cursor-Position := Cursor-Position + MoveForwardSingleWord:#Chars.
@END(Example)

Is this value, @B{MoveForwardSingleWord}:@i{#Chars}, a constant?@Foot{
Note that the file is being represented as an array.
In this case, we'll assume this is the only representation the system knew.
In general, there may be other descriptions -- and some work may be required
to decide which of these to use.}
@END(Multiple)

U:@\ No.

KA:@\ How do I compute it?

U:@\ By moving enough characters to reach the end of a word.

KA:@\ What is a word?@Foot{
Realize if KA already knew about words, we'd be done.  The "distance" slot
above would be trivially filled in, with (essentially) a function which
advanced to the next word break.  
This "analogy" would then be quite uninteresting ... mere character substitution
would do...}

U:@\ A string of characters, delimited by some word separator (like space or 
punctuation.)

@\@i{<This approach leads to a reformulation of the description of a file.
I've only the foggiest of notions how to automate this process.>}

KA:@\ Ok.  Do these "word"s form a partitioning of the file?
That is, can any sequence of characters be divided into a sequence of words?

U:@\ Yes.

@BEGIN(Multiple)
KA:@\ (Thinking aloud) So I can describe the cursor position in a file as
a pair, @i{<word-posn, char-posn>} -- meaning the cursor is currently
@i{char-posn} characters past the start of the @i{word-posn}-th word
of the file.

In this representation, considering Cursor-Position now as a 2-D vector,
C-F means
@BEGIN(Example)
	Cursor-Position := Cursor-Position + <0,1>,
@END(Example)
(except when the cursor had been at the end of a word)
and M-F means
@BEGIN(Example)
	Cursor-Position := Cursor-Position + <1,0>.
@END(Example)

Then the @i{#Chars} slot should be "extended" to a @i{Delta} slot;
and @B{MoveForwardSingleChar}:@i{Delta} = <0,1>, and
MoveForwardSingleWord:Delta = <1,0>.

Is this correct?
@END(Multiple)

U:@\ Well, it might have been, but it isn't.@Foot{
Note this is what happens with line movement -- one remains the same number
of characters from the beginning after a C-N or C-P -- or as close to that
position as possible.}
To explain this command I will need to make
the definition of "word" more precise.
It is a string of 1 or more consecutive "regular" (i.e. non-delimiter) characters,
followed by a string of 1 or more word delimiter characters.
The M-F command always places the cursor at the first delimiter following
the current (or next) patch of regular characters.
M-B always places the cursor at the first regular character of a word.
In general it is within the current word -- unless you're already at the
start of a word when the command is issued.

@BEGIN(Multiple)
KA:@\ Hmm. So perhaps an appropriate representation of the cursor position is a
triple:
@Example{<word-posn, delimitor?, char-posn>,}
where @i{word-posn} has the same meaning it had above, 
@i{delimitor?} is a bit, indicating whether the cursor is now in
the midst of delimiters or regular characters,
and @i{char-posn} tells how many characters from the start of this string.

(Note I had to use the fact that regular and delimiter characters are disjoint.)

I got the following from what you said:
M-F maps

@BEGIN(Example)
<N, 0, M> => <N, 1, 0>
<N, 1, M> => <N+1, 1, 0>;
	[Note non-terminals are capitalized; and 0 in the 2nd position means
	 the cursor is pointing to a regular character, 1 means to a delimiter.]
@END(Example)
and M-B maps

@BEGIN(Example)
<N, 0, 0> => <N-1, 0, 0>
<N, 0, M> => <N, 0, 0>		[provided M>0]
<N, 1, P> => <N, 0, 0>?
@END(Example)

Is this correct?
@END(Multiple)

U:@\ Yes.
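
@\ The triple representation and the two mappings can be written out as a
short Python sketch. The delimiter set, the function names, and the
restriction to text that begins with a regular character are assumptions made
for illustration only.

```python
DELIMITERS = set(" \t\n.,;:!?")  # assumed delimiter set, for illustration


def is_delimiter(ch):
    return ch in DELIMITERS


def to_triple(text, pos):
    """Describe index pos as (word_posn, delimiter_bit, char_posn).

    Assumes 0 <= pos < len(text) and that text starts with a regular character.
    """
    word, in_delims, offset = 0, is_delimiter(text[0]), 0
    for i in range(1, pos + 1):
        now_delim = is_delimiter(text[i])
        if now_delim == in_delims:
            offset += 1  # still inside the same run of characters
        else:
            if not now_delim:
                word += 1  # a run of delimiters ended, so a new word begins
            in_delims, offset = now_delim, 0
    return (word, int(in_delims), offset)


def meta_f(triple):
    """M-F on triples: <N,0,M> => <N,1,0> and <N,1,M> => <N+1,1,0>."""
    n, delim_bit, _ = triple
    return (n, 1, 0) if delim_bit == 0 else (n + 1, 1, 0)


def meta_b(triple):
    """M-B on triples: only <N,0,0> moves back to the previous word."""
    n, delim_bit, m = triple
    if delim_bit == 0 and m == 0:
        return (n - 1, 0, 0)
    return (n, 0, 0)
```

For the text "foo bar, baz", index 5 (the "a" of "bar") becomes the triple
(1, 0, 1), and meta_f maps it to (1, 1, 0), the comma that follows "bar".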

@\@i{<Ok, so now we can readily tell, symbolically, what happens when M-F and
M-B are typed.  We still need an algorithmic description of this.
Furthermore, we still never said when to apply this command.>}

@BEGIN(Multiple)
KA:@\ So the M-F algorithm is

@BEGIN(ProgramExample)
WHILE  Delimiter( Current-Char ) DO Advance-Char();	[skip any delimiters]
WHILE ~Delimiter( Current-Char ) DO Advance-Char();	[stop at the first delimiter after the word]
@END(ProgramExample)

and M-B is

@BEGIN(ProgramExample)
BackUp-Char();
WHILE  Delimiter( Current-Char ) DO BackUp-Char();
WHILE ~Delimiter( Current-Char ) DO BackUp-Char();
Advance-Char();		[step onto the first regular character of the word]
@END(ProgramExample)

where Current-Char is the current character in the file,
and the Advance-Char() function advances the cursor one position,
while BackUp-Char() moves the cursor back one position.@Foot{
This gets a bit messy around the beginning and end of a file --
We're safe if there are SOF and EOF (delimiter) characters in those positions.}
@END(Multiple)

U:@\ Looks good.
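
@\ As a cross-check on U's description of these two motions, here is a minimal
Python sketch over a file held as a string with an integer cursor. The
delimiter set, the function names, and the use of bounds checks (in place of
the SOF and EOF sentinel characters mentioned in the footnote) are all
assumptions of the sketch, not EMACS's actual implementation.

```python
DELIMITERS = set(" \t\n.,;:!?")  # assumed delimiter set, for illustration


def is_delimiter(ch):
    return ch in DELIMITERS


def forward_word(text, pos):
    """M-F: land on the first delimiter after the current (or next) word."""
    n = len(text)
    while pos < n and is_delimiter(text[pos]):      # skip any delimiters
        pos += 1
    while pos < n and not is_delimiter(text[pos]):  # skip the word itself
        pos += 1
    return pos


def backward_word(text, pos):
    """M-B: land on the first regular character of the current (or previous) word."""
    while pos > 0 and is_delimiter(text[pos - 1]):      # skip delimiters behind us
        pos -= 1
    while pos > 0 and not is_delimiter(text[pos - 1]):  # back up to the word's start
        pos -= 1
    return pos
```

With text = "foo bar, baz", forward_word(text, 0) stops on the space at index 3,
and backward_word(text, 5) returns to the "b" at index 4.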

KA:@\ I still have to deduce when to actually use these commands.
As a word has many letters, I assume I should use the single M-F command
rather than a bunch of C-Fs, whenever possible.@Foot{
There are, of course, many nuances of this --
for example, to reach the 10th character of an 11-character word,
one would probably advance one word,
then back up a single character, rather than advance ten characters.}
Can I assume these delimiters are easy (for you humans) to see?

U:@\ Yep.

@\@i{<Now what about the other two commands - M-D, M-<rubout>?>}

KA:@\ Tell me more about M-D and M-<rubout>.

U:@\ Consider all the characters from the current position up to where 
M-F would place the cursor.  
M-D deletes all of these characters.
Similarly M-<rubout> deletes all of the characters from the current position
up to where M-B would have placed the cursor.@Foot{
One might say M-F : M-D :: M-B : M-<rubout>, in this sense.}

@BEGIN(Multiple)
KA:@\ Ok, so M-D does the following

@BEGIN(ProgramExample)
WHILE  Delimiter( Current-Char ) DO Delete-Char();	[delete any delimiters]
WHILE ~Delimiter( Current-Char ) DO Delete-Char();	[delete up to where M-F would stop]
@END(ProgramExample)

and M-<rubout> performs

@BEGIN(ProgramExample)
BackUp-Char();
WHILE  Delimiter( Current-Char ) DO <Delete-Char(); BackUp-Char()>;
WHILE ~Delimiter( Current-Char ) DO <Delete-Char(); BackUp-Char()>;
Advance-Char();		[step off the delimiter that precedes the deleted word]
@END(ProgramExample)

where Delete-Char() deletes the current character, in effect advancing the cursor.

Correct?
@END(Multiple)

U:@\ Of course.
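
@\ U's definition of the two deletions in terms of the two motions can be
sketched directly in Python. Here the file is a string and the cursor an
index. The delimiter set and all function names are assumptions of this
sketch, and the motion helpers are repeated only so that it is self-contained.

```python
DELIMITERS = set(" \t\n.,;:!?")  # assumed delimiter set, for illustration


def forward_word(text, pos):
    """Where M-F would place the cursor."""
    n = len(text)
    while pos < n and text[pos] in DELIMITERS:
        pos += 1
    while pos < n and text[pos] not in DELIMITERS:
        pos += 1
    return pos


def backward_word(text, pos):
    """Where M-B would place the cursor."""
    while pos > 0 and text[pos - 1] in DELIMITERS:
        pos -= 1
    while pos > 0 and text[pos - 1] not in DELIMITERS:
        pos -= 1
    return pos


def delete_word_forward(text, pos):
    """M-D: remove everything from the cursor up to where M-F would stop."""
    end = forward_word(text, pos)
    return text[:pos] + text[end:], pos


def delete_word_backward(text, pos):
    """M-<rubout>: remove everything from where M-B would stop up to the cursor."""
    start = backward_word(text, pos)
    return text[:start] + text[pos:], start
```

Each function returns the new text together with the new cursor index, so the
relation M-F : M-D :: M-B : M-<rubout> is visible in the code itself.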

KA:@\ And I'll use this whenever possible, as I would M-F to move?

U:@\ Yes, for the time being. 
(As these deletions are destructive, this could be dangerous.  Fortunately
KA doesn't yet know about things like advancing too far, then backing up...)

KA:@\ Ok, I have M-F, M-B, M-D and M-<rubout> under my belt. What now?

U:@\ S-expression commands are just like WORD commands, except they
use M-C-@i{x} rather than M-@i{x}.@Foot{
Note this is better than saying they are just like character commands,
as an S-expression is more like a word than a single character 
(in that both words and S-expressions have many characters).}

KA:@\ So I can assume there are the following commands?
@BEGIN(Example)
M-C-F -- move the cursor forward one *S-expression*
M-C-B -- move the cursor backward one *S-expression*
M-C-D -- delete the *S-expression* the cursor is pointing to
M-C-<rubout> -- delete the *S-expression* immediately before the cursor
@END(Example)

U:@\ Yes.

KA:@\ What's an S-expression?

U:@\ An S-expression is a string of characters, (recursively) defined as follows:
it is either an atom, or an opening parenthesis, "(", followed by a sequence
of atoms, separated by spaces, and then a closing parenthesis.

KA:@\ The obvious question: what is an atom?

U:@\ (For the time being, assume) it is a word.

KA:@\ So 

<<<<here>>>>

@END(Description)
@Chapter(Miscellaneous Editor Analogies)

Related EMACS commands:

@BEGIN(Verbatim)
Word commands are just like character commands,
	EXCEPT they use @i{M-x} rather than @i{C-x}.

S-expression commands are just like word commands,
	EXCEPT they use @i{M-C-x} rather than @i{M-x}.

Sentences are handled like lines,
	EXCEPT the command is @i{M-x} rather than @i{C-x},
	AND these commands refer to next sentence, not same line.
(recall line commands are for this line, and stay in place (idempotent)).

Buffers are like File Visiting,
	EXCEPT @i{C-X B} to switch about files (info lost when visiting)
	AND the file is not lost when next file is "encountered".

Windows are handled like Buffers,
	EXCEPT they are visible (and so have fewer lines...)
	AND they use @i{C-X C-O/1/2} to switch between them, (not @i{C-X B})

@END(Verbatim)

EMACS to E differences

@BEGIN(Verbatim)
EMACS has mode -- as it is adaptable (note E is not...)
(similarly for sub-modes, user profiles, ...)

@G(a)> like C-U C-N -- now makes sense to find other E commands which correspond to
C-U <x>.  Besides α<, there aren't any...

Note <alt> n C-V moves screen up n chars,
whereas @G(a)n<form> moves forward n screenfuls.

@G(a)N and C-X C-X (return to last mark) both go to other side of buffer, once it
has been deposited.
@END(Verbatim)
@Chapter(Predictions)

So far the analogies had a single use -- for U to communicate a bundle
of facts to ES, simply and quickly.
In the scenario given, the full derivation which KA had to produce to
note such connections might simply be thrown away after the result
(eg how to move forward one word) has been deduced.

There is (at least) one reason to maintain this information:
It could be used by KA to draw feasible inferences,
based on what it has seen so far.  
These plausible inferences would suggest other analogous connections
of commands,
based on those old analogies.
For example, if KA later found that C-T meant to
transpose the (single) characters in front of and behind the cursor,
it might ask whether M-T meant to transpose words.
Indeed this is the case.
@Foot{This implies some method to this EMACS madness.
<Here: Designed artifact assumption.>}
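
The kind of plausible inference described here can be sketched as a simple
substitution over stored analogies. The table layout, the function name
conjecture, and the plain string replacement are assumptions of this sketch.
Anything it proposes is a guess that KA would still have to confirm with U.

```python
# Analogies confirmed in the dialogue: a key prefix and unit on one side map
# to a key prefix and unit on the other. (Hypothetical storage format.)
KNOWN_ANALOGIES = [
    {"from_prefix": "C-", "from_unit": "character",
     "to_prefix": "M-", "to_unit": "word"},
    {"from_prefix": "M-", "from_unit": "word",
     "to_prefix": "M-C-", "to_unit": "S-expression"},
]


def conjecture(key, description):
    """Propose unconfirmed commands analogous to a newly learned one."""
    guesses = []
    for analogy in KNOWN_ANALOGIES:
        prefix, unit = analogy["from_prefix"], analogy["from_unit"]
        if key.startswith(prefix) and unit in description:
            new_key = analogy["to_prefix"] + key[len(prefix):]
            new_description = description.replace(unit, analogy["to_unit"])
            guesses.append((new_key, new_description))
    return guesses


print(conjecture("C-T", "transpose the characters in front of and behind the cursor"))
```

Given the newly learned C-T, the sketch proposes M-T with "characters" replaced
by "words", which is the question KA would then put to U.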

Eventually systems of known analogies could be analyzed,
searching for regularities.
One observation is that C- seems to refer to lowest (physical) level,
and M- to the next (often logical) level.  
(Not only was this true in the characters/words case,
but also for lines/sentences.)
From this, one might infer that, as C-N goes down one line,
M-N would go down, say, one paragraph. (Note this is faulty --
oh well.)
<<Need better Example>>

@Section(Thoughts on this task/domain choice/...)

This particular domain, of Editors, is clearly artificial.
"Understanding" EMACS requires a version of "designed artifact" notion:
the parts were designed for some reason; and if we can (re)construct that
reason, we'll be able to predict other aspects of the system -- ie other
parts probably used the same type of justification...  Here, a reason might
be "ease of memory" (hence similar names for similar commands, and the use of
nice abbreviations for ease of use -- fewer keystrokes),
or that commonly used instructions should be short and concise (ie a single
command).  Pedagogy - need to know certain basic ops for any editor. (See ?)

The role of this system is vaguely similar (analogous) to Davis' Teiresias
@Cite(TIER) --
helping the user to input new facts, possibly finding errors, (using, in this
case, some crude semantic information as well as the syntactic constraints).


@Chapter(Further tasks)

Now knowing about EMACS, try to learn E.
@BEGIN(Itemize)
@BEGIN(Multiple)
(As with EMACS,)
when two E commands perform a similar function,
expect their invocations to be similar.@*
[Deep reasoning: designed by humans for humans...]@*
Ex: @G(a)_ & @G(ab)_ related in E, (as are @G(a)_ and @G(a)X_,)
as C- and M- are in EMACS@*
(Consider @G(a)F and @G(a)XF, or @G(a)D and @G(ab)D...)@*

[Note: this is not always the case - consider programming languages,
in which "WHILE ... DO ..." is quite different from the similar
"REPEAT ... UNTIL ...".]
@END(Multiple)

Same types of commands in E as in EMACS 
-- cursor movement, text substitution, movement, insertion...@*
[Deep reason: both are editors,
and the purpose of an editor is to perform some function...]

Some forms -- especially those which lead to an obscure function --
require many keystrokes.  In E, this is done by @G(a)X ___, which
is similar to M-X ___ in EMACS.@*
[Deep reason: all commands cannot be single character...]

Pages in E are like narrow window in EMACS 
-- serving to restrict search, general context, ...@*
[This shortens otherwise long searches, movements, ...]@*
Difference: E's pages are static, defined in a file,
whereas narrow windows are dynamically assigned in EMACS.
@END(Itemize)
 
Onto harder problems --
It is relatively easy to map from EMACS to Bravo or E 
-- as they are all full screen editors.
What about TECO or SOS?
(Note these TTY editors make quite different assumptions:
here, typing is expensive, and should be avoided.  Hence the user must
explicitly state he wants to see X before seeing it.)

Go from here to -- knowledge about Text Editors -- to more general knowledge
about Text Formatters, like TEX or SCRIBE.  Core facts, about words, spaces,
etc, map over, but (few if any) actual commands will.

On another dimension,
consider the EMACS to representation languages mapping.
Both attempt to facilitate the storage and retrieval of information.
The major difference is generality: RLs are not restricted to dealing with text.

Consider now programming languages in general
-- which, like editors, are also based on sequences of commands, etc.

What about editors to real world map?
Files are like filing cabinets,
words like sequences of actions, ... etc.

Briefly, one might find classifications like this in almost any other area as well:
consider music recognition/appreciation:
@BEGIN(Example)
Same piece, different instrumentalists/conductor
	[Preserves notes, timing, etc]
Same piece, different arrangement (for other instruments)
	[Preserves contour of notes, timing, etc]
Same composer, same time of life (perhaps different movements in same piece)
	[Preserves ideas, defn of musicalness, etc]
Same composer, different time of life
	[Preserves "style", derived themes, ability, etc]
Same time period (eg baroque, or gregorian)
	[Preserves overall form (eg acceptability), some tonalities etc]
Same tonality (eg western, or oriental -- more local: slavic)
	[Preserves musical function, ...]
@END(Example)

--another dimension - same instruments, ...

-----
Or, within the domain of computer languages
-- consider InterLisp to MacLisp mapping, or to APL (another interpreter)
or to Pascal (reasonably well designed language) or
to Fortran (well, must have some of the same ideas).
@Chapter(Conclusion)

People seem amazingly adept at transferring "knowledge" or "expertise" from
one field or task domain to another, similar one.
For example, it is considerably easier to learn a second editor than it was
to learn the first; likewise for the second programming language, or second
instrument, etc.
(This obvious fact has indeed been empirically verified, by @Cite(Rumelhart).)

Why is this?
Clearly much of the information one assimilates when learning EMACS is
about the concept of what an editor does, and about editors in general.
This general information provides an important framework/structure off of which
to hang the "new" facts about that second editor --
hence much of the Bravo facts can be viewed simply as instances of those
already understood general editor facts.

When this structure is explicitly known, (and exactly apt,)
this learning task can be considered simple instantiation.
However the student is usually unable to articulate this information --
it has been "compiled" into his competence with editors, or whatever.
In addition, there are times when the old facts -- that existent
structure -- are not correct.  For example, E makes great use of pages,
which EMACS just barely knows about.

To use that old information, in both of these cases, requires analogical
thinking -- the ability (i) to reason from the known examples of EMACS in action
to understand what its sibling E is trying to do, and (ii) to
construct a new framework from a slightly-inapplicable one, to handle the
new case.

This scenario demonstrated both forms of analogical reasoning at play.
Much research is required before any of this processing will be automated.
That is the basic goal of my research.

--- and its use in general for creating and updating the KBs an ES will need
to use.
@Appendix(For What Type of Pupil Is Analogy Appropriate?)
@Tag(StartingInfo)

Ask yourself how you would teach EMACS to a person.
In particular, how would you alter your presentation based on
what your student already knows?
Whether you will be able
to (profitably) use analogy as a teaching tool should
depend heavily on his current state of knowledge.

To demonstrate this point, we present three vignettes, 
all on the theme of learning EMACS.
In all three cases,
the student enters knowing the various single character commands --
including single character movements (C-F, C-B) 
and single character modifications (C-D, C-<rubout>, insertion).
He is then told that
@BEGIN(Example)
(*)	WORD commands are just like CHARACTER commands, EXCEPT they use
	"Meta-" rather than "Control-".
@END(Example)
The accounts below will differ only in terms of the learner's prior knowledge.

I. Mr RankBeginner (RB for short)
knows nothing about text editors, and little or nothing about
computers in general.  
Learning those single character commands was a slow
and painful process; but he can now just barely
navigate about the file, and effect those changes he desires.

I claim he would find (*) rather confusing.
While it is easy to syntactically transform his internal rules -- from
@BEGIN(Example)
	"To move forward one character, type C-F"	(i)
@END(Example)
to
@BEGIN(Example)
	"To move forward one word, type M-F",		(ii)
@END(Example)
he would have no idea when to use this new M-F command.
Given that terms like "word" are novel and alien,
how could he know what it means to "move forward one word"?  
While RB has some notion of where the cursor is,
that definition is with respect to characters;
and some non-trivial reasoning would be required to map
that information over to understand what it means for the cursor to point
to the current word.

Of course the familiar interpretations of "word",
and the intuitive notion of current word position would provide him with
a good start at understanding this command.@Foot{
We will claim that a user "understands" a command if he is able to use
it appropriately -- defined using a battery of exercising tests.
See Appendix @Ref(TestBed).}
It does, however, lead to other questions.  
What does it mean to delete the previous word (which is M-<rubout>'s function)?
In particular, what happens when the cursor is pointing to a character
inside a word?

What about C-P and C-N?
Intuitively these commands say go up or down A SINGLE CHARACTER.
It is reasonable to ask if there are M-P and M-N commands;
and if so, what do they mean?

Etc, etc, etc. 
Without a few explicit examples in hand, RB would probably simply
ignore this "word" frill, and continue using the tried-and-true C-x commands
he understands.

II. Mr BeenThereBefore (BTB) is more computer-mature.
He has (expertly) used several editors before,
and from these experiences derived a good intuitive 
feel for what sort of things an editor is supposed to do,
and how it goes about performing these tasks.

In particular, he already has a framework dedicated to 
"types of movement commands",
which currently holds only the known C-F and C-B commands.
On hearing (*), BTB's task is relatively straightforward --
he needs only store the new M-F and M-B commands at the 
appropriate place in this structure.
(M-D and M-<rubout> fit into his pre-existing text-modification framework,
of course.)
As BTB already has a basic definition of "word"
(derived from his knowledge of editors in general,)
he can readily deduce (an approximation to) the semantics of these commands.
Finally, BTB's internal rules, which tell how and when to move about the file,
are more sophisticated -- instead of RB's 
@BEGIN(Example)
	"To move forward one character, type C-F",		(i)
@END(Example)
BTB has something like
@BEGIN(Example)
	"To move forward to a new position,
		use the largest movement command available."	(iii)
@END(Example)
From this, and facts which state that words are larger than single characters
(together with certain implicit facts 
-- for example, that our visual system allows us people to find word boundaries
very rapidly),
BTB can readily incorporate these new M-x commands.
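
Rule (iii) can be read as a greedy choice between M-F and C-F. The Python
sketch below is one possible reading, under several assumptions: the file is a
string, the cursor is an index, the delimiter set and the names forward_word
and plan_forward are made up for the illustration, and only forward movement
is planned. It never overshoots and backs up, so it does not capture subtler
strategies that mix forward and backward commands.

```python
DELIMITERS = set(" \t\n.,;:!?")  # assumed delimiter set, for illustration


def forward_word(text, pos):
    """Where M-F would place the cursor (first delimiter after the next word)."""
    n = len(text)
    while pos < n and text[pos] in DELIMITERS:
        pos += 1
    while pos < n and text[pos] not in DELIMITERS:
        pos += 1
    return pos


def plan_forward(text, pos, target):
    """Return keystrokes moving the cursor from pos to target, largest steps first."""
    if not pos <= target <= len(text):
        raise ValueError("target must lie at or after pos, within the text")
    keys = []
    while pos < target:
        word_end = forward_word(text, pos)
        if word_end <= target:
            keys.append("M-F")  # the larger command fits, so prefer it
            pos = word_end
        else:
            keys.append("C-F")  # otherwise fall back to a single character
            pos += 1
    return keys


print(plan_forward("hello world foo", 0, 12))
```

For the text "hello world foo", reaching index 12 from index 0 takes two M-F
commands followed by one C-F under this rule, rather than twelve C-F commands.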

Note BTB still has some "boundary type" questions to resolve.
He does not yet know precisely what constitutes a word in this system,
nor does he know the details of how these commands will behave.
What distinguishes his case from RB's is that 
BTB knows how to go about answering 
(or perhaps, guessing at the answers to)
these questions:
First, he can make quite reasonable predictions based
on what other editors (eg E) would do here 
-- refined by his knowledge of the qualitative differences between
those other known editors and EMACS.
(For example, EMACS is a character-based editor, while E is line-oriented.)
If BTB needs a more exact answer, he can perform simple experiments.
Note this requires that he make the (to him) obvious assumption that
there is both a method and a consistency to these commands --
a realization that poor RB would have had trouble accepting.
Another approach begins by figuring out how to find the exact documentation,
and reading that description.@Foot{
This deals with a sense of closure - or rather, of achieving it quickly
by virtue of having the proper slots and organizing framework.
Realize too BTB may find his existent framework lacking, and have to fix it up.}

III. Our third inquirer is Mr KnowItAll (KIA). 
In addition to BTB's background, he also knows TECO thoroughly, and,
furthermore, has access to the EMACS source code.
The analogy offered above is almost pointless,
as KIA could have derived everything he needed from "first principles"
-- from facts about editors in general, and from the "definition" of EMACS.
He would, at best, appreciate being given this pointer,
which will help focus his search when next he 
seeks other cursor movement commands.
Like the slave in the Meno,
he has not been taught anything new 
-- only provided with a modicum of direction.
@AppendixSec(Scrutiny)
@Tag(Exam)

Let's now look more closely at these cases.

In case (I),
I imagine the analogy would go over RB's head -- or, at least, 
he would have to be both very sharp, and quite diligent, to get anything from (*).
To fully utilize this new information, he would have to first generate
(something isomorphic to) the organizing hierarchy which both BTB and KIA
had from the start.
How could RB possibly know when to use M-F, and when C-F, unless he
had general cursor movement rules, like (iii)?
(Notice (iii)'s conclusion triggers other rules; and it is these which,
in turn, invoke a M-F or C-F.)
Rules like (iii) are pretty easy to generate once one has 
realized that both of those commands are designed to achieve similar results 
(i.e. forward cursor positioning).
It is through observations like this that the hierarchy is generated.@Foot{
So far we have included only Forward Direct Cursor Movements.
Once C-B and M-B are examined, it will be obvious these Backward Direct Cursor
Movements should be included too. Eventually C-S search commands will be
added as well.
Realize this hierarchy establishes a particular, non-obvious cut at the world
-- along the functionality lines
(eg cursor movement) rather than keystrokes, which would have lumped M-F with
M-D.
Anyone with experience with editors will agree that this dimension is appropriate.}
Many expert system developers have commented that a large part of expertise
derives from having these appropriate structures around.
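
To make the two possible "cuts" concrete, the sketch below groups a handful of
commands first by functionality and then by keystroke prefix. The table and
the name group_by are hypothetical; the entry for M-D assumes it deletes a
word, which this section does not itself state.

```python
from collections import defaultdict

# Each command is described as (function, direction, unit).
# ASSUMPTION: M-D is taken to be a word-deletion command.
COMMANDS = {
    "C-F": ("cursor movement", "forward", "character"),
    "M-F": ("cursor movement", "forward", "word"),
    "C-B": ("cursor movement", "backward", "character"),
    "M-B": ("cursor movement", "backward", "word"),
    "M-D": ("text modification", "forward", "word"),
}

def group_by(key_fn):
    """Partition the command table according to key_fn(keys, description)."""
    groups = defaultdict(list)
    for keys, description in COMMANDS.items():
        groups[key_fn(keys, description)].append(keys)
    return dict(groups)

# Cut along functionality: M-F sits with the other cursor movements.
by_function = group_by(lambda keys, desc: desc[0])

# Cut along keystrokes: M-F is lumped with M-D instead.
by_prefix = group_by(lambda keys, desc: keys.split("-")[0])

if __name__ == "__main__":
    print(by_function)
    print(by_prefix)
```

Only the first grouping supports a rule like (iii), since only there do M-F and
C-F appear as alternatives for the same goal.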

KIA, in Case (III), made essentially no use of the analogy presented.
This points to the obvious observation on the role of analogical reasoning:
It should be used only when the more powerful, and/or more reliable, methods
are NOT available.
Of course, in this extreme case, 
KIA's knowledge of "first principles" obviated his need to be taught at all.

Consider the less exaggerated example,
where the user has learned that M-5 C-F will move the cursor
forward 5 characters,
and now wants to move the cursor forward by 7 characters.
It seems unproductive to tell him that M-7 C-F is analogous to M-5 C-F,
just replacing the 5 by 7 in both the command to type,
and in the resultant goal.
There is an obvious partition of the M-5 C-F command into two parts 
-- the repetition factor followed by the "primitive" command.  
One can infer from this that any number, <N>, can be substituted for the 5,
and cause the cursor to move forward that number of characters.
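
The partition just described can be sketched as follows. The names decompose,
PRIMITIVE_EFFECT and cursor_after are hypothetical, and the handling of several
M-<digit> keys in a row as one multi-digit count is an assumption not taken
from the text above.

```python
# Sketch of decomposition analysis: split a command string into a repetition
# factor and a "primitive" command, then assign semantics to each part.

def decompose(keys):
    """Split e.g. 'M-5 C-F' into (5, 'C-F'); the count defaults to 1."""
    tokens = keys.split()
    digits = ""
    # ASSUMPTION: consecutive M-<digit> keys build up a multi-digit count.
    while (tokens and len(tokens[0]) == 3
           and tokens[0].startswith("M-") and tokens[0][2].isdigit()):
        digits += tokens.pop(0)[2]
    count = int(digits) if digits else 1
    return count, " ".join(tokens)

# Effect of each primitive on the cursor position, in characters.
PRIMITIVE_EFFECT = {"C-F": +1, "C-B": -1}

def cursor_after(keys, pos):
    """Predict the cursor position after typing keys at position pos."""
    count, primitive = decompose(keys)
    return pos + count * PRIMITIVE_EFFECT[primitive]

if __name__ == "__main__":
    print(decompose("M-5 C-F"))
    print(cursor_after("M-7 C-F", 10))
```

Once the two parts have separate meanings, M-7 C-F needs no analogy to M-5 C-F;
its effect follows from the count and the primitive alone.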

Analogy is not needed in this case, as we had a stronger inferring strategy
-- of assigning semantics to individual parts of the command string.
Such decomposition analysis will always be preferable to analogical reasoning --
whenever it is possible.
(Ie we had to know that commands can be divided into independent parts.
Of course this is not always true -- consider ? command, where a C-U argument
means, ... or other types of commands which require long strings of characters
(as in M-X Query Replace<alt>...).)

Another example of this knowing-too-much problem is less clear,
but follows along the lines developed earlier in this paper.
Realize that KIA already "knows" EMACS's definition of words,
and hence merely has to hear the particular name for the 
"Move One Word Forward" command to insert that
into his master structure.
All of the essential details of the 
"semantics" of this command easily fall out using the obvious inheritance 
-- from his corpus of facts, first about cursor movement units,
and then about words.

Anyway, based on what we discussed above,
Case (I) is way too difficult (for people, to say nothing of machines), and
Case (III) is fairly trivial.
This leaves us with Case (II) --
which is the type of situation on which this research will focus.
Reiterating: we assume the ES already knows some (but not all) of the relevant
first principles about the domain being investigated (in this case, editors).
Furthermore, it must have a non-trivial collection of known particular facts
which can serve as appropriate analogues for new objects.  (Here, knowing about
character movements was essential to begin explaining the function of the
word movement commands.)
Finally, that KA front-end part of ES must have some idea how to draw up and
use the analogies provided.
@AppendixSec(Other cases)

It is not immediately obvious how common this Case (II) is.
To demonstrate that these constraints are not that imposing, or unusual,
this subsection will briefly list a few other instances of this type
of situation.
[Note this corresponds nicely with the second phase of the KA task
(given in @Cite(SACON)).  Here the evolving knowledge base has a basic
(if slightly erroneous) outline of the domain,
and much of the essential vocabulary.]

@BEGIN(Enumerate)
@B(Geology:)@*
@i{Known Domain Details:} The existent expert system LITHOS knows about the
	Saudi Arabian peninsula.@*
@i{Core Facts:} Imagine it also had some body of core facts about
geology in general.@*
@i{New Domain Details:} Using those first principles, explain to LITHOS that  
	western Europe is just like the Arabian peninsula, EXCEPT ...

@B(Chemistry:)@*
@i{Known Domain Details:} Imagine Dendral currently knows about hydrates.
	[R-C(OH)2-H].@*
@i{Core Facts:} Imagine it knew enough chemistry to realize that amines were like
	hydroxyl groups, and to deduce just how they were similar and how
	different.@*
@i{New Domain Details:} One could then propose teaching Dendral about
	ammoniates [R-C(NH2)2-H].

@B(Music:)@*
@i{Known Domain Details:} Ability to play a stringed instrument - here the violin.@*
@i{Core Facts:} Basic physics (vibrating string), crude knowledge of instrument
	design (adjacent (open) strings would be harmonious), music facts
	(including harmony...)@*
@i{New Domain Details:} Learning to play the Viola -- or even Piano or Recorder.

@B(Programming:)
	From InterLisp to MacLisp - or even Fortran
	(or, so that's an iteration, or pass by reference)
@END(Enumerate)
@Appendix(Starting Set of Rules)

@Tag(Rules)
The rules below are included in the starting RULES Knowledge Base.
Their rule interpreter is an EMYCIN-like inference engine, which
back-chains from a goal description down to primitive operations.

(The commands are written in the form
@BEGIN(Example)
	To do X, execute Y.
@END(Example)
This is easily translated into the standard
@BEGIN(Example)
	IF you want to achieve X, THEN Execute Y.)
@END(Example)

@BEGIN(Example)
To move forward a single character,
  Execute the command: C-F.

To execute the <X> command N times in a row,
  Execute <X> M times in a row, and then Execute <X> P times in a row,
	WHERE M+P = N.

To execute the <X> command 4 times in a row,
  Execute the command: C-U <X>.

To advance in the file,
  Execute a cursor movement command.

To Execute a cursor movement,
	WHERE the absolute position of the destination is known,
  Execute an absolute move command.

To Execute a cursor movement,
	WHERE the relative position of the destination, (wrt current pos'n,) is known,
  Execute an incremental move command.

To Execute a direct move command,
	WHERE that new position is in the current line,
  THEN move forward (some number of) characters.

@END(Example)
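
The following is a minimal sketch of how such rules might be back-chained; it
is not the actual interpreter. The tuple encoding and the names RULES, achieve
and repeat are hypothetical. Two simplifications are assumptions of the sketch:
a bridging rule connects "incremental" to "direct" move commands (the rules
above do not link them explicitly), and the last rule bottoms out in a single
character move rather than "some number of" them.

```python
# Each rule: (goal, required fact or None, (kind, target)), where kind is
# "execute" (target is a primitive command) or "subgoal" (target is a goal).
RULES = [
    ("move forward a single character", None, ("execute", "C-F")),
    ("advance in the file", None, ("subgoal", "execute a cursor movement")),
    ("execute a cursor movement", "absolute position known",
     ("subgoal", "execute an absolute move command")),
    ("execute a cursor movement", "relative position known",
     ("subgoal", "execute an incremental move command")),
    # ASSUMPTION (not in the rule set above): bridge incremental to direct.
    ("execute an incremental move command", None,
     ("subgoal", "execute a direct move command")),
    # ASSUMPTION: simplified to a single character move.
    ("execute a direct move command", "destination in current line",
     ("subgoal", "move forward a single character")),
]

def achieve(goal, facts):
    """Back-chain from goal to primitive commands; None if no rule applies."""
    for rule_goal, condition, (kind, target) in RULES:
        if rule_goal != goal:
            continue
        if condition is not None and condition not in facts:
            continue
        if kind == "execute":
            return [target]
        plan = achieve(target, facts)
        if plan is not None:
            return plan
    return None

def repeat(command, n):
    """Rules 2 and 3: split N into M+P, using C-U <X> for each block of 4."""
    if n == 0:
        return []
    if n >= 4:
        return ["C-U " + command] + repeat(command, n - 4)
    return [command] + repeat(command, n - 1)

if __name__ == "__main__":
    facts = {"relative position known", "destination in current line"}
    print(achieve("advance in the file", facts))
    print(repeat("C-F", 6))
```

When only the absolute position is known the chain fails, since this starting
set contains no rule for absolute move commands.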

A rule which will be Learned:

@BEGIN(Example)
To move forward (some number of) characters,
	WHEN there is a word boundary before that destination,
  Execute the command: M-F.
(There is now one fewer word boundary between your position and the destination.)
@END(Example)
@Appendix<(Starting) Description of Editors>

@Tag(EditorFacts)
Part of the EDITOR Knowledge Base.
We list below first a vocabulary, and then a list of operations.
Consider how you would fit your favorite editor into this framework.
(Note the match is probably not exact -- eg where are multiple screens - or for that
matter, buffers?  Not in E.)

Vocabulary:
Character, String, (Word?)
Forward, Backward, (Up, Down, Previous, Next)
Position
Search, substitute
Delete, Insert
Move, Copy

@BEGIN(Example)
Operations
I. Cursor movement
  A. Direct, incremental
    (Forward vs Backward)
    (Char [F&B], Line [U&D], Page [Next, Previous])
  B. Direct, Absolute
    (Char, Line, Page)
	1. Goto absolute position
	2. Return to prior position 
  C. Indirect
    (Forward or Backward - or wrap around)
    (Restricted vs Unrestricted)
	1. Search

II. Text Modification
  A. Local [wrt Cursor]
    (Char vs string of characters)
	1. Insert
	2. Delete
	3. Overwrite
  B. Global
    (Full file vs Rest of file)
	1. Substitution
	2. (Massive) Text deletion
	3. Text movement

III. Interface to Operating System
  A. File Management
	1. Entering a file
	2. Saving current image
	3. Exiting
	4. Restoring previous version
  B. Other Forks
	1. (Returning to) superior fork
	2. Calling an inferior fork
	  a. and dumping user there
	  b. and doing something there

IV. Commands which affect other commands
  A. Local effects (affecting only a single command)
	1. Repeat (Do this many times)
	  a. Absolute value
	  b. Until condition met
	2. Undo prior command
	3. Doc vs Execute (short term mode)
  B. Global effects (from now on)
	1. Macros
	2. Modes
	3. User initialization package
@END(Example)

------
@BEGIN(Example)
		Files
		/|\
	Window	Mark	Paragraphs

    Lines	Sentences	S-expressions
	
		Word

		Character

[note logical vs physical]
@END(Example)
@Appendix(Evaluating Body of Test Cases).
@Tag(TestBed)

Validating ES's knowledge will involve giving the system some
task to solve, and watching its behaviour.
(Example task: Given a file containing xxx, find the 3rd occurrence of yyy
following the cursor, which is now at position zzz.)
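
A reference answer for a task of this kind could be computed independently and
compared with the position at which ES leaves the cursor. The sketch below is
one way to do so; the name nth_occurrence_after is hypothetical, and treating
"following the cursor" as "starting at or after the cursor position", with
overlapping matches counted, is an assumption.

```python
def nth_occurrence_after(text, pattern, cursor, n):
    """Return the index of the n-th occurrence of pattern at or after cursor.

    Returns None when fewer than n occurrences follow the cursor.
    ASSUMPTION: overlapping occurrences are counted separately.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    pos = cursor
    found = None
    for _ in range(n):
        found = text.find(pattern, pos)
        if found == -1:
            return None
        pos = found + 1   # resume the search just past this match's start
    return found

if __name__ == "__main__":
    sample = "ab yyy cd yyy ef yyy gh yyy"
    print(nth_occurrence_after(sample, "yyy", 4, 3))
```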